LIPN: Introducing a new Geographical Context Similarity Measure and a Statistical Similarity Measure based on the Bhattacharyya coefficient

نویسندگان

  • Davide Buscaldi
  • Jorge García Flores
  • Joseph Le Roux
  • Nadi Tomeh
  • Belém Priego Sanchez
چکیده

This paper describes the system used by the LIPN team in the task 10, Multilingual Semantic Textual Similarity, at SemEval 2014, in both the English and Spanish sub-tasks. The system uses a support vector regression model, combining different text similarity measures as features. With respect to our 2013 participation, we included a new feature to take into account the geographical context and a new semantic distance based on the Bhattacharyya distance calculated on cooccurrence distributions derived from the Spanish Google Books n-grams dataset.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Similarity Measure Based on Item Proximity and Closeness for Collaborative Filtering Recommendation

Recommender systems utilize information retrieval and machine learning techniques for filtering information and can predict whether a user would like an unseen item. User similarity measurement plays an important role in collaborative filtering based recommender systems. In order to improve accuracy of traditional user based collaborative filtering techniques under new user cold-start problem a...

متن کامل

A new vector valued similarity measure for intuitionistic fuzzy sets based on OWA operators

Plenty of researches have been carried out, focusing on the measures of distance, similarity, and correlation between intuitionistic fuzzy sets (IFSs).However, most of them are single-valued measures and lack of potential for efficiency validation.In this paper, a new vector valued similarity measure for IFSs is proposed based on OWA operators.The vector is defined as a two-tuple consisting of ...

متن کامل

Robust statistical registration of 3D ultrasound images using texture information

We investigate a new registration method for ultrasound volumes relying on on a statistical texture-based similarity measure. Texture information is given by spatial Gabor filters and represented by statistical kernel-based distributions. The registration similarity measure is then defined as a probabilistic distance, derived from Bhattacharyya coefficient, between two statistical distributions...

متن کامل

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

متن کامل

A new similarity measure between type-2 fuzzy numbers and fuzzy risk analysis

In this paper, we present a revised similarity measure based onChen-and-Chen's similarity measure for fuzzy risk analysis. The revisedsimilarity measure uses the corrected formulae to calculate the centre ofgravity points, therefore it is more  effective than the Chen-and-Chen'smethod. The revised similarity measure can overcome the drawbacks of theexisting methods. We have also proposed a new ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014